Measuring associational thinking through word embeddings
Abstract
The development of a model to quantify semantic similarity and relatedness between words has been the major focus of many studies in various fields, e.g., psychology, linguistics, and natural language processing. Unlike the measures proposed by most previous research, this article aims at automatically estimating the strength of associations between words, whether or not the words are semantically related. We demonstrate that performance depends not only on the combination of independently constructed word embeddings (namely, corpus-based and network-based embeddings) but also on the way these vectors interact. The research concludes that a weighted average of cosine-similarity coefficients derived from two independent vector spaces tends to yield high correlations with human judgements. Moreover, evaluating associations through a measure that relies on the rank ordering of pairs reveals some findings that go unnoticed by traditional measures such as Spearman's and Pearson's correlation coefficients.
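The core idea of the abstract (combining cosine similarities from two independently built embedding spaces via a weighted average) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `association_strength`, the weight `alpha`, and the toy embeddings are all assumptions for demonstration purposes.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association_strength(word_a, word_b, corpus_emb, network_emb, alpha=0.5):
    """Weighted average of cosine similarities computed in two independent
    vector spaces (e.g. corpus-based and network-based embeddings).
    `alpha` weights the corpus-based space; (1 - alpha) the network-based one.
    Illustrative sketch only -- not the paper's exact formulation."""
    sim_corpus = cosine(corpus_emb[word_a], corpus_emb[word_b])
    sim_network = cosine(network_emb[word_a], network_emb[word_b])
    return alpha * sim_corpus + (1 - alpha) * sim_network

# Toy two-dimensional embeddings (illustrative only)
corpus_emb = {"cat": np.array([1.0, 0.2]), "dog": np.array([0.9, 0.3])}
network_emb = {"cat": np.array([0.5, 0.8]), "dog": np.array([0.4, 0.9])}
score = association_strength("cat", "dog", corpus_emb, network_emb, alpha=0.6)
```

In practice the two spaces would come from, e.g., a corpus-trained model and an embedding of a word-association network, and `alpha` would be tuned against human association norms.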
Similar resources
Word Embeddings through Hellinger PCA
Word embeddings resulting from neural language models have been shown to be successful for a large variety of NLP tasks. However, such architectures can be difficult to train and time-consuming. Instead, we propose to drastically simplify the word embeddings computation through a Hellinger PCA of the word co-occurrence matrix. We compare those new word embeddings with the Collobert and Weston (...
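The Hellinger PCA idea from the snippet above can be sketched in a few lines: normalize each co-occurrence row into a probability distribution, apply the square-root (Hellinger) map, then project onto the top principal components. This is a minimal sketch under those assumptions, not the paper's implementation; the function name and toy counts are illustrative.

```python
import numpy as np

def hellinger_pca_embeddings(cooc, dim=2):
    """Word embeddings via PCA on Hellinger-transformed co-occurrence rows.
    Each row of `cooc` is normalized to a probability distribution,
    square-rooted (the Hellinger map), centered, and projected onto
    the top `dim` principal components."""
    probs = cooc / cooc.sum(axis=1, keepdims=True)   # row distributions
    H = np.sqrt(probs)                               # Hellinger transform
    H_centered = H - H.mean(axis=0)
    # PCA via SVD of the centered matrix
    U, S, Vt = np.linalg.svd(H_centered, full_matrices=False)
    return H_centered @ Vt[:dim].T                   # project to `dim` dims

# Toy 4-word co-occurrence counts (illustrative only)
cooc = np.array([[10., 2., 0., 1.],
                 [3., 8., 1., 0.],
                 [0., 1., 9., 4.],
                 [1., 0., 5., 7.]])
emb = hellinger_pca_embeddings(cooc, dim=2)
```

The appeal over neural language models, as the snippet notes, is that this reduces embedding computation to a single matrix factorization.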
Measuring Topic Coherence through Optimal Word Buckets
Measuring topic quality is essential for scoring the learned topics and their subsequent use in Information Retrieval and Text classification. To measure the quality of Latent Dirichlet Allocation (LDA)-based topics learned from text, we propose a novel approach based on grouping of topic words into buckets (TBuckets). A single large bucket signifies a single coherent theme, in turn indicating high...
Centroid-based Text Summarization through Compositionality of Word Embeddings
Textual similarity is a crucial aspect for many extractive text summarization methods. A bag-of-words representation does not allow one to grasp the semantic relationships between concepts when comparing strongly related sentences with no words in common. To overcome this issue, in this paper we propose a centroid-based method for text summarization that exploits the compositional capabilities o...
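A basic form of the centroid approach described in the snippet above can be sketched as follows: represent each sentence as the mean of its word embeddings, compute the document centroid, and keep the sentences closest to it. This is a hedged sketch of the general technique, not the cited paper's method; the function name, toy embeddings, and sentences are assumptions.

```python
import numpy as np

def centroid_summarize(sentences, embed, k=1):
    """Score each sentence by cosine similarity between its vector
    (mean of its word embeddings) and the document centroid; return
    the top-k sentences in document order. Illustrative sketch only."""
    def sent_vec(s):
        vecs = [embed[w] for w in s.split() if w in embed]
        return np.mean(vecs, axis=0)
    def cos(u, v):
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    sent_vecs = [sent_vec(s) for s in sentences]
    centroid = np.mean(sent_vecs, axis=0)
    scores = [cos(v, centroid) for v in sent_vecs]
    top = np.argsort(scores)[::-1][:k]
    return [sentences[i] for i in sorted(top)]

# Toy embeddings and sentences (illustrative only)
embed = {"cats": np.array([1.0, 0.1]), "purr": np.array([0.9, 0.2]),
         "loudly": np.array([0.8, 0.3]), "stocks": np.array([0.1, 1.0])}
summary = centroid_summarize(
    ["cats purr", "cats purr loudly", "stocks stocks"], embed, k=1)
```

Because sentence vectors are compositions of word embeddings, two related sentences with no words in common can still score as similar, which is exactly the limitation of bag-of-words the snippet points out.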
The Geometry of Culture: Analyzing Meaning through Word Embeddings
We demonstrate the utility of a new methodological tool, neural-network word embedding models, for large-scale text analysis, revealing how these models produce richer insights into cultural associations and categories than is possible with prior methods. Word embeddings represent semantic relations between words as geometric relationships between vectors in a high-dimensional space, operationaliz...
What's in an Embedding? Analyzing Word Embeddings through Multilingual Evaluation
In the last two years, there has been a surge of word embedding algorithms and research on them. However, evaluation has mostly been carried out on a narrow set of tasks, mainly word similarity/relatedness and word relation similarity and on a single language, namely English. We propose an approach to evaluate embeddings on a variety of languages that also yields insights into the structure of ...
Journal
Journal title: Artificial Intelligence Review
Year: 2021
ISSN: 0269-2821, 1573-7462
DOI: https://doi.org/10.1007/s10462-021-10056-6